Abstract

We adopt a game theoretic approach for the design and analysis of distributed resource allocation 
algorithms in fading multiple access channels. The users are assumed to be selfish, rational, and limited 
by average power constraints. We show that the sum-rate optimal point on the boundary of the multiple- 
access channel capacity region is the unique Nash Equilibrium of the corresponding water-filling game. 
This result sheds new light on the opportunistic communication principle and argues for the fairness
of the sum-rate optimal point, at least from a game theoretic perspective. The base-station is then 
introduced as a player interested in maximizing a weighted sum of the individual rates. We propose 
a Stackelberg formulation in which the base-station is the designated game leader. In this set-up, the
base-station first announces its strategy, defined as the decoding order of the different users in the
successive cancellation receiver as a function of the channel state. In the second stage, the users
compete conditioned on this particular decoding strategy. We show that this formulation allows for 
achieving all the corner points of the capacity region, in addition to the sum-rate optimal point. On the 
negative side, we prove the non-existence of a base-station strategy in this formulation that achieves the 
rest of the boundary points. To overcome this limitation, we present a repeated game approach which 
achieves the capacity region of the fading multiple access channel. Finally, we extend our study to 
vector channels highlighting interesting differences between this scenario and the scalar channel case. 


1 Introduction 

The design and analysis of efficient resource allocation algorithms for wireless channels has received 
significant research interest for many years. In a pioneering work, Tse and Hanly have characterized 
the capacity region of the fading multiple access channel and the corresponding optimal power and rate 
allocation policies [3]. The centralized nature of these policies motivates our work here on the design
and analysis of distributed allocation strategies that approach the optimal performance. Arguably, such 
distributed implementations are more desirable from a practical perspective. 

In this paper, we adopt a game theoretic framework where the users are typically modelled as rational 
and selfish players interested in maximizing the utilities they obtain from the network. The selfish behavior 
implies that individual users do not care about the overall system performance. Over the last ten years, 
game theoretic tools have been used to design distributed resource allocation strategies in a variety of 
contexts. For example, Mackenzie et al. consider the collision channel [11], Yu et al. focus on the digital 
subscriber line setup [12], Etkin et al. investigate the power allocation game in the Gaussian interference 
channel [13], and La et al. model the power control problem in Gaussian multiple access channels as a 
cooperative game where the users are allowed to form coalitions [10]. Probably the scenario closest to 
our work is the design of distributed power control algorithms for the up-link of Code Division Multiple 
Access (CDMA) systems considered in e.g., [4-9]. These papers focus on time-invariant channels and 
construct utility functions that allow the users to reach a socially optimal equilibrium. These works, 
however, reach the negative conclusion that the selfish behavior entails a fundamental performance loss 
in the sense that the achievable utilities at the equilibrium points¹, if they exist, are usually inefficient as
compared with the centralized policy [4, 8]. The central contribution of this paper is showing how to 
overcome this negative conclusion in fading channels by exploiting the time varying nature of fading, 
modelling the base-station as an additional player with the appropriate decoding strategy, and resorting to 
a repeated game formulation if needed. 

We start with a static Nash formulation which only models the multiple access users as players. In 
this formulation, every player treats the signals of other users as Gaussian noise (with the appropriate 
variance) and is interested in maximizing its achievable rate subject to an average power constraint. The 
static nature of the game means that the game is played only once; it does not imply a fixed channel environment.
In this scenario, the optimal power allocation strategy of every player is given by the water-filling response 
to other players’ strategies. Remarkably, we show that the unique Nash equilibrium of this water-filling 
game is the sum-rate optimal point on the boundary of the capacity region [3]. In a sense, this result 
establishes the fairness of the sum-rate point, at least from a game theoretic perspective. Hoping to achieve 
other boundary points of the capacity region, we then introduce the base-station as a player interested in 
maximizing a weighted sum of the individual rates. By allowing the base-station to announce its decoding 
strategy first, we transform our game into a Stackelberg formulation [18]. Here, we establish the ability 
of this approach to achieve all the corner points of the capacity region in addition to the sum-rate optimal
point. The key idea is for the base-station to use a successive decoding strategy while altering the decoding 
order as a function of the channel state. The final step, that allows for achieving all points on the boundary 
of the capacity region, is to use a dynamic game approach. In this set-up, the base-station can use the 
decoding order as a punishment tool forcing the multiple access users to adopt the optimal power control 
policies. We then extend our results to vector channels where different conclusions (as compared to the
scalar case) are drawn. It is worth noting that our approach is purely information theoretic, and hence, we
do not introduce other elements such as pricing mechanisms [4] into the problem. In particular, we limit
the payoff functions to depend only on the achievable rate(s), and define the multiple access user strategy
as a power/rate allocation policy and the base-station strategy as a decoding algorithm.

¹ The rigorous definition of equilibrium points will be given in the sequel.

The rest of the paper is organized as follows. In Section 2, we present the system model and briefly
review known results on the capacity of fading multiple access channels. Section 3 includes our results
on the water-filling game for scalar fading channels. In particular, we devote Section 3.1 to the Nash
formulation, Section 3.2 to the Stackelberg formulation, and Section 3.3 to the dynamic game scenario.
Section 4 highlights some interesting structural differences between scalar and vector channels. Finally,
we close with some concluding remarks in Section 5.


2 Background 


We consider a discrete-time flat fading multiple access channel with N users and one base-station. The 
signal received by the base-station at time $n$ is²

$$y(n) = \sum_{i=1}^{N} \sqrt{h_i(n)}\, x_i(n) + z(n), \tag{1}$$


where $x_i(n)$ and $h_i(n)$ are the transmitted signal and fading channel gain of the $i$th user at time $n$. Similar
to [3], we assume the fading process to be jointly stationary and ergodic. We further assume that the
stationary distribution has a continuous density and is bounded. User $i$ has an average power constraint
$\bar{P}_i$ and $z(n)$ is a sample of a zero-mean white Gaussian noise process with variance $\sigma^2$. The capacity
region of this channel depends on the fading process characteristics and the availability of the channel 
state information (CSI). 

If the channel gains are assumed to be fixed and known a-priori (i.e., time invariant channel) then we 
are reduced to the Gaussian multiple-access channel where the capacity region is well known [1]. For the 
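two user case, this region $Q_g$ is given by:

$$\begin{aligned}
R_1 &\le \frac{1}{2}\log_2\left(1 + \frac{h_1 \bar{P}_1}{\sigma^2}\right), \\
R_2 &\le \frac{1}{2}\log_2\left(1 + \frac{h_2 \bar{P}_2}{\sigma^2}\right), \\
R_1 + R_2 &\le \frac{1}{2}\log_2\left(1 + \frac{h_1 \bar{P}_1 + h_2 \bar{P}_2}{\sigma^2}\right).
\end{aligned} \tag{2}$$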


² In this paper, we use lowercase letters for scalars, boldface lowercase letters for vectors, and boldface uppercase letters for matrices.

Figure 1: The two-user multiple access channel.

It is easy to see that the boundary of $Q_g$ is a pentagon. The two corner points are achieved by employing
a successive decoding strategy at the base-station, and the other boundary points are achieved by appropriate
time sharing between the two decoding strategies used at the corner points [1]. For time-varying channels
with only receiver CSI, the capacity region is also known [2]. For the two user case, the new capacity
region can be interpreted as the average of the rate expressions in (2) with respect to the fading channel
distribution.
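
As a small numerical illustration of the pentagon in (2), the following sketch computes its two corner points for fixed gains. The function name, its arguments, and the default unit noise variance are illustrative assumptions and do not come from the original development.

```python
import math


def gaussian_mac_corner_points(h1, h2, p1, p2, noise_var=1.0):
    """Corner points of the two-user Gaussian MAC pentagon described by (2).

    h1, h2 are the (fixed) channel power gains, p1, p2 the power constraints.
    Returns (corner_a, corner_b, r_sum), where each corner is a pair (R1, R2).
    """
    def rate(snr):
        return 0.5 * math.log2(1.0 + snr)

    r1_max = rate(h1 * p1 / noise_var)               # single-user bound on R1
    r2_max = rate(h2 * p2 / noise_var)               # single-user bound on R2
    r_sum = rate((h1 * p1 + h2 * p2) / noise_var)    # sum-rate bound

    # Corner A: user 2 is decoded first (treating user 1 as noise),
    # then user 1 is decoded free of interference.
    corner_a = (r1_max, r_sum - r1_max)
    # Corner B: the decoding order is reversed.
    corner_b = (r_sum - r2_max, r2_max)
    return corner_a, corner_b, r_sum
```

At corner A, the rate of user 2 coincides with the rate obtained when user 1 is treated as additional Gaussian noise, which is the successive decoding interpretation mentioned above.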

In this paper, we consider time-varying channels where the CSI is available a-priori at all the transmitters
and the receiver. This scenario was considered by Tse and Hanly [3] where they characterized the
capacity region $Q_c$ along with the corresponding centralized power and rate allocation policies $(\mathcal{P}_c, \mathcal{R}_c)$.
It was also shown in [3] that the power and rate allocation policies are unique and each boundary point
corresponds to the maximization of a weighted sum of the individual rates. All the boundary points are
achieved by successive decoding, where the decoding order is determined by the rate award vector $\boldsymbol{\mu}$ [3].

The capacity region for the two user case is shown in Figure 2. The corner point $CR_1$ is achieved by
using the following policy: user 1 water-fills over the background noise level and user 2 water-fills over
the sum of the interference from user 1 and the background noise. At the base-station user 2 is decoded
first followed by user 1. We denote the rate pair at this point as $(R_{1,CR_1}, R_{2,CR_1})$. At point $CR_2$, the roles
of users 1 and 2 are reversed and we refer to the rate pair by $(R_{1,CR_2}, R_{2,CR_2})$. Another boundary point
of particular interest is the maximum sum-rate point SP. Unlike the AWGN Multiple Access Channel
(MAC), this point is unique in our case and is achieved by a time-sharing policy where only one user is
allowed to transmit at any fading state [3,14]. This observation will prove instrumental to the development
of the main result in Section 3.1.

The centralized nature of the optimal power and rate allocation policies $(\mathcal{P}_c, \mathcal{R}_c)$ motivates our pursuit
for distributed strategies that approach the capacity region of the fading MAC. Our assumption that the 
CSI is known everywhere implies that the games considered here are games with perfect and complete 
information [4-13]. Without loss of generality, and to avoid some tedious details, we limit our discussion 
to pure strategies [18,19]. 
Figure 2: The capacity region of the two user fading multiple access channel.

3 The Water-Filling Game 

For simplicity of presentation, we first consider in detail the two user scenario. Our arguments extend to
the $N$ user channel as briefly outlined in Section 3.4.


3.1 Nash Formulation 


Here, we consider a static non-cooperative game where the players are the multiple-access users. In this
game, the strategy of user $i$ is the power control policy $\mathcal{P}_i$ and rate control policy $\mathcal{R}_i$. The corresponding
payoff function is defined as the average achievable rate $R_i = E_{\mathbf{h}}[\mathcal{R}_i]$ with $\mathbf{h} = [h_1, h_2]^T$. The goal of
user $i$ is to

$$\max_{\mathcal{P}_i} R_i(\mathcal{P}_i, \mathcal{P}_{-i}) \quad \text{s.t.} \quad \mathcal{P}_i \in \mathcal{F}_i, \tag{3}$$

where $\mathcal{F}_i = \{\mathcal{P}_i : E_{\mathbf{h}}[\mathcal{P}_i] \le \bar{P}_i,\ \mathcal{P}_i(\mathbf{h}) \ge 0\}$ is the set of all feasible power control policies of user $i$, and
$\mathcal{P}_{-i}$ represents the power control policy of the other user (in the more general case, $\mathcal{P}_{-i}$ refers to the strategies
of all users except user $i$). Since the base-station is not a player of the game, we assume that each user will
treat the signal of the other user as interference. Given the power control policy $\mathcal{P}_2(h_1, h_2)$ of user 2, the
payoff of user 1 is given by


$$R_1 = \iint \frac{1}{2}\log_2\left(1 + \frac{\mathcal{P}_1(h_1, h_2)h_1}{\sigma^2 + \mathcal{P}_2(h_1, h_2)h_2}\right) f(h_1, h_2)\,dh_1\,dh_2. \tag{4}$$


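Here $f(h_1, h_2)$ is the joint probability density function of the two fading coefficients. The payoff function
of user 2 is defined similarly. As we can see, the payoff function of each user depends on the two power
control policies $(\mathcal{P}_1, \mathcal{P}_2)$. Before proceeding further, we need the following definition from [19].

Definition 1 A Nash equilibrium is a policy pair $(\mathcal{P}_1^*, \mathcal{P}_2^*)$ such that

$$\begin{aligned}
R_1(\mathcal{P}_1^*, \mathcal{P}_2^*) &\ge R_1(\mathcal{P}_1', \mathcal{P}_2^*), \quad \forall \mathcal{P}_1' \in \mathcal{F}_1, \\
R_2(\mathcal{P}_1^*, \mathcal{P}_2^*) &\ge R_2(\mathcal{P}_1^*, \mathcal{P}_2'), \quad \forall \mathcal{P}_2' \in \mathcal{F}_2.
\end{aligned} \tag{5}$$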


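This definition means that at the Nash equilibrium, no user can benefit by unilaterally deviating. Given
a fixed power control policy of user 2, the optimal strategy $\mathcal{P}_1(h_1, h_2)$ of user 1 is the solution to the
following optimization problem

$$\begin{aligned}
\max_{\mathcal{P}_1} \quad & R_1 = \iint \frac{1}{2}\log_2\left(1 + \frac{\mathcal{P}_1(h_1, h_2)h_1}{\sigma^2 + \mathcal{P}_2(h_1, h_2)h_2}\right) f(h_1, h_2)\,dh_1\,dh_2, \\
\text{s.t.} \quad & \iint \mathcal{P}_1(h_1, h_2) f(h_1, h_2)\,dh_1\,dh_2 \le \bar{P}_1, \\
& \mathcal{P}_1(h_1, h_2) \ge 0.
\end{aligned} \tag{6}$$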


We wish to emphasize the fact that each user is actually not aware of the policy used by the other user. 
Starting from an arbitrary initial point, each user can only rely on the assumption of rationality to guess 
the policy employed by the other user. Based on this guess, each user chooses a new policy as a best 
response to the conceived policy of the other user. This process is then repeated, hoping to converge at an 
equilibrium. One of the central themes in game theory is to characterize such equilibria, if they exist [18]. 

It is easy to verify that the objective function in (6) is concave, the constraint set is convex, and Slater's
condition is satisfied. Hence, the solution to this problem is the well-known water-filling power allocation, i.e.,

$$\mathcal{P}_1(h_1, h_2) = \left(\lambda_1 - \frac{\sigma^2}{h_1} - \frac{\mathcal{P}_2(h_1, h_2)h_2}{h_1}\right)^+, \tag{7}$$

in which $(x)^+ = \max\{x, 0\}$ and $\lambda_1$ is the power level that satisfies

$$\iint \left(\lambda_1 - \frac{\sigma^2}{h_1} - \frac{\mathcal{P}_2(h_1, h_2)h_2}{h_1}\right)^+ f(h_1, h_2)\,dh_1\,dh_2 = \bar{P}_1. \tag{8}$$


Similarly, the optimal policy of user 2, given a fixed policy for user 1, is given by

$$\mathcal{P}_2(h_1, h_2) = \left(\lambda_2 - \frac{\sigma^2}{h_2} - \frac{\mathcal{P}_1(h_1, h_2)h_1}{h_2}\right)^+. \tag{9}$$


From these expressions, one can see that the optimal policy of each user depends largely on its guess of
the other user's policy. Based on this guess, each user will determine its policy and adjust its water-filling
level to maximize its own average rate. At the Nash equilibrium, the water-filling pair $(\lambda_1, \lambda_2)$ satisfies
the two average power constraints with equality. Now we are ready to prove our first result.
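
The best response in (7) and (8) can be sketched numerically as follows. This is only an illustration under stated assumptions: the fading distribution is replaced by a finite set of equally likely sampled states, the water level is found by bisection, and the function name and arguments are hypothetical.

```python
import numpy as np


def best_response_waterfill(h_own, h_other, p_other, p_avg, noise_var=1.0, iters=100):
    """Water-filling best response of one user, in the spirit of (7)-(8).

    h_own, h_other: arrays of sampled (positive) channel gains, one entry per
        fading state; the states are assumed equally likely.
    p_other: power used by the other user in each state.
    p_avg: average power constraint of the responding user.
    Returns (power policy per state, water level).
    """
    # Effective noise floor seen by this user in each state.
    floor = (noise_var + p_other * h_other) / h_own
    # At level floor.max() + p_avg the average power is at least p_avg.
    lo, hi = 0.0, floor.max() + p_avg
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.maximum(mid - floor, 0.0).mean() > p_avg:
            hi = mid
        else:
            lo = mid
    level = 0.5 * (lo + hi)
    return np.maximum(level - floor, 0.0), level
```

For example, calling the function with `p_other` set to zeros returns the ordinary single-user water-filling policy over the background noise.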


Theorem 1 The maximum sum-rate point SP of the capacity region $Q_c$ is the unique Nash equilibrium of
our water-filling game.


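Proof: First, we show that only time-sharing equilibria exist. Suppose there exists a non time-sharing
equilibrium with the corresponding water-level pair $(\lambda_1, \lambda_2)$. Then for some channel realizations
$h_1, h_2$, we have $\mathcal{P}_1(h_1, h_2) > 0$, $\mathcal{P}_2(h_1, h_2) > 0$, and

$$\begin{aligned}
\frac{\sigma^2}{h_1} + \frac{\mathcal{P}_2(h_1, h_2)h_2}{h_1} + \mathcal{P}_1(h_1, h_2) &= \lambda_1, \\
\frac{\sigma^2}{h_2} + \frac{\mathcal{P}_1(h_1, h_2)h_1}{h_2} + \mathcal{P}_2(h_1, h_2) &= \lambda_2.
\end{aligned} \tag{10}$$

Multiplying the first equation by $h_1$ and the second by $h_2$ shows that $\lambda_1 h_1$ and $\lambda_2 h_2$ both equal
$\sigma^2 + \mathcal{P}_1(h_1, h_2)h_1 + \mathcal{P}_2(h_1, h_2)h_2$. From these two equations, we therefore get

$$\lambda_1 = \frac{\lambda_2 h_2}{h_1}. \tag{11}$$

Since $\lambda_1, \lambda_2$ are constants, and the fading coefficients are characterized by a continuous pdf, (11) is
satisfied with zero probability. This implies the existence of only time-sharing Nash equilibria.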

Under the time-sharing equilibrium, when $\mathcal{P}_1(h_1, h_2) > 0$, the sum of the background noise and the
interference from user 1 should be larger than the water-level of user 2. Thus when user 1 transmits, the
channel conditions should satisfy the following inequality

$$\mathcal{P}_1(h_1, h_2)\frac{h_1}{h_2} + \frac{\sigma^2}{h_2} = \left(\lambda_1 - \frac{\sigma^2}{h_1}\right)\frac{h_1}{h_2} + \frac{\sigma^2}{h_2} = \frac{\lambda_1 h_1}{h_2} \ge \lambda_2. \tag{12}$$


Similarly, when user 2 transmits, the channel conditions should satisfy the following condition

$$\frac{\lambda_2 h_2}{h_1} \ge \lambda_1. \tag{13}$$


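The water-filling levels can now be obtained by solving the following two equations

$$\begin{aligned}
\iint_{\lambda_1 h_1 \ge \lambda_2 h_2} \left(\lambda_1 - \frac{\sigma^2}{h_1}\right)^+ f(h_1, h_2)\,dh_1\,dh_2 &= \bar{P}_1, \\
\iint_{\lambda_2 h_2 \ge \lambda_1 h_1} \left(\lambda_2 - \frac{\sigma^2}{h_2}\right)^+ f(h_1, h_2)\,dh_1\,dh_2 &= \bar{P}_2.
\end{aligned} \tag{14}$$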


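The corresponding power control policies are unique and given by

$$\mathcal{P}_1(h_1, h_2) = \left(\lambda_1 - \frac{\sigma^2}{h_1}\right)^+, \quad \text{when } h_1 > \frac{\lambda_2}{\lambda_1} h_2, \tag{15}$$

$$\mathcal{P}_2(h_1, h_2) = \left(\lambda_2 - \frac{\sigma^2}{h_2}\right)^+, \quad \text{when } h_2 > \frac{\lambda_1}{\lambda_2} h_1, \tag{16}$$

with $\mathcal{P}_1(h_1, h_2) = 0$ and $\mathcal{P}_2(h_1, h_2) = 0$ in other cases.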

It was shown in [3] that the centralized policy corresponding to the point SP is time sharing with the same
power allocation levels as (15) and (16). Finally, the fact that the solution to (14) is unique [3] implies that
the only Nash equilibrium of the distributed power control game is the maximum sum-rate point of the
capacity region (i.e., SP). □
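
The fixed-point property used in this proof can be checked numerically. The sketch below is an illustration under several assumptions that are not part of the original argument: fading states are drawn from an exponential distribution, the water levels are picked arbitrarily (so the average power constraints are simply whatever average powers these levels induce), and all function names are hypothetical. It builds the time-sharing policies of (15) and (16) and verifies that each one is the water-filling response (7) or (9) to the other.

```python
import numpy as np


def time_sharing_policies(h1, h2, lam1, lam2, noise_var=1.0):
    """Policies (15)-(16): only the user with the larger lam_i * h_i may transmit."""
    user1_on = lam1 * h1 > lam2 * h2
    p1 = np.where(user1_on, np.maximum(lam1 - noise_var / h1, 0.0), 0.0)
    p2 = np.where(~user1_on, np.maximum(lam2 - noise_var / h2, 0.0), 0.0)
    return p1, p2


def waterfill_against(h_own, h_other, p_other, lam_own, noise_var=1.0):
    """Water-filling response (7) for a given water level lam_own."""
    return np.maximum(lam_own - (noise_var + p_other * h_other) / h_own, 0.0)


rng = np.random.default_rng(0)
h1, h2 = rng.exponential(size=(2, 10_000))   # illustrative fading samples
lam1, lam2 = 1.5, 2.0                        # arbitrary illustrative water levels

p1, p2 = time_sharing_policies(h1, h2, lam1, lam2)

# No fading state has both users transmitting (time sharing).
both_active = np.count_nonzero((p1 > 0) & (p2 > 0))

# Each policy reproduces itself as the best response to the other one.
is_fixed_point = (
    np.allclose(p1, waterfill_against(h1, h2, p2, lam1))
    and np.allclose(p2, waterfill_against(h2, h1, p1, lam2))
)
```

In this construction the water levels are kept fixed, so the responding user meets the same average power as before; this is why reproducing the same policy is enough to confirm the mutual best-response property on the sampled states.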

Two comments are now in order. 

1. Theorem 1 establishes the remarkable fact that the selfish behavior of the users will lead them
to jointly optimize the sum-rate of the channel. In fact, this result provides a new interpretation 
of the opportunistic communication principle [14]. At any particular instance, the user with the 
strongest channel sees a relatively weak interference from the other user, and hence, decides to 
transmit with a high power level. On the other hand, the other user sees a strong interferer in addition
to a weak channel, and hence, decides to conserve the power for later usage. This way, they reach 
the opportunistic time sharing equilibrium distributively. This result also establishes a certain game- 
theoretic fairness of the point SP. The underlying idea is that the selfishness of the different users 
will balance-out at the sum-rate optimal point. To impose other fairness criteria, the base-station 
must be involved in the game as argued in the next section. 

2. Theorem 1 contrasts with the negative conclusions drawn in earlier works on the efficiency of game
theoretic approaches in CDMA up-link power control (e.g., [4-9]). The enabling vehicle behind
this result is the time varying nature of the fading channel. With these temporal variations, the CSI
(available at all transmitters) acts like a common randomness that allows the users to reach a more
efficient equilibrium based on a selfish rationale. This is yet another manifestation of the positive 
impact that fading, if properly exploited, can have on certain aspects of wireless systems. 

3.2 Stackelberg Formulation 

In the previous section, we have shown that the only boundary point achievable by our Nash game is the 
optimal sum-rate point. One can attribute this limitation to the assumption that every user (player) will treat 
the other user’s signal as noise. While this assumption does not entail a loss at the time sharing point SP, it 
does not allow for achieving other boundary points. Such points require the base-station to employ a more 
sophisticated decoding rule. In [3], it was shown that successive decoding, with the appropriate ordering, 
is sufficient to achieve all the boundary points. This observation motivates a game theoretic formulation 
where the base-station is introduced as an additional player. The base-station strategy corresponds to a 
particular choice of the decoding order, as detailed next. 

We wish to stress that, unlike the centralized scenario [3], the base-station in our formulation does 
not dictate the power level and rate of the individual users. Still, it is reasonable to assume that the roles 
of the base-station and multiple-access users are not totally symmetric. Therefore, we do not model the 
base-station as an ordinary player in our game but rather appeal to the bi-level programming notion [15]. 
Bi-level programming is typically used in modelling a decision making process where there is a hierarchical
relationship between the decision makers. In our context, bi-level programming corresponds to a
Stackelberg game [15,19], where the leader announces its strategy first and then the remaining players
react according to a specific equilibrium concept among them. Here, we designate the base-station as the 
game leader, and hence, it will announce its decoding strategy in the first level of the game. This way, 
the base-station can rely on the rational and selfish nature of the multiple access players to influence their 
behavior in the second stage (i.e., low level game). 

In this work, we consider a class of successive decoding strategies parameterized by the decoding order
as a function of the fading gains $(h_1, h_2)$. More precisely, the base-station divides the whole possible space
of $(h_1, h_2)$ into two subsets $D_1, D_1^c$. When $(h_1, h_2) \in D_1$, the base-station will decode user 1's information
first whereas $(h_1, h_2) \in D_1^c$ implies decoding user 2's signal first. After the base-station announces its
strategy, i.e., $D_1$, the multiple access users play the low level game using the Nash equilibrium concept.
The strategy space of user $i$ is still $\mathcal{F}_i$, and the payoff function of user $i$ is defined as the supremum of
the achievable rate. Here supremum refers to the fact that in the rate expressions to follow we always
assume the users to be decoded successfully (which is a critical assumption in the successive decoding
approach). We will show later that, at the Nash equilibrium, this condition indeed holds. Hence, the
supremum corresponds exactly to the achieved payoff. With a slight abuse of notation, the payoff function
of each user is written as


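$$\begin{aligned}
R_1(D_1, \mathcal{P}_1, \mathcal{P}_2) &= \iint \frac{1}{2}\log_2\left(1 + \frac{\mathcal{P}_1(h_1, h_2)h_1}{\sigma^2 + \mathcal{P}_2(h_1, h_2)h_2 I_{\{(h_1, h_2) \in D_1\}}}\right) f(h_1, h_2)\,dh_1\,dh_2, \\
R_2(D_1, \mathcal{P}_1, \mathcal{P}_2) &= \iint \frac{1}{2}\log_2\left(1 + \frac{\mathcal{P}_2(h_1, h_2)h_2}{\sigma^2 + \mathcal{P}_1(h_1, h_2)h_1 I_{\{(h_1, h_2) \in D_1^c\}}}\right) f(h_1, h_2)\,dh_1\,dh_2.
\end{aligned} \tag{17}$$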


Here $I_{\{\cdot\}}$ is the indicator function. In order to achieve the average rate in (17) for a given base-station
strategy $D_1$, each user will use two codebooks. The low rate codebook is multiplexed across the fading
states in which the user is decoded first and the high rate codebook is multiplexed across the other fading
states. The payoff function of the base-station is defined as

$$\mu_1 R_1(D_1, \mathcal{P}_1, \mathcal{P}_2) + \mu_2 R_2(D_1, \mathcal{P}_1, \mathcal{P}_2). \tag{18}$$

This payoff function has a natural economic interpretation as the revenue of the base-station where $\mu_i$ can
be viewed as the payment that user $i$ owes per unit rate. The value of $\mu_i$ can be decided using an auction
process [16], where each user submits its proposed payment $\mu_i$ to the base-station in order to maximize its
own utility. In this work, we do not consider this auction process and assume that $\boldsymbol{\mu} = [\mu_1, \mu_2]^T$ is given.
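
To make the role of the decoding partition concrete, the payoffs in (17) and (18) can be evaluated on sampled fading states as in the sketch below. The equally likely sampled states, the function name, and the boolean-mask representation of $D_1$ are assumptions made for illustration only.

```python
import numpy as np


def stackelberg_payoffs(h1, h2, p1, p2, decode_user1_first, mu=(1.0, 1.0), noise_var=1.0):
    """Average rates (17) and base-station payoff (18) on sampled fading states.

    decode_user1_first: boolean array, True where (h1, h2) lies in D_1.
    The states are assumed equally likely, so integrals become sample means.
    Returns (R1, R2, weighted sum).
    """
    in_d1 = np.asarray(decode_user1_first, dtype=bool)
    # The user decoded first sees the other user's signal as interference;
    # the user decoded second sees only the background noise.
    interference_on_1 = np.where(in_d1, p2 * h2, 0.0)
    interference_on_2 = np.where(~in_d1, p1 * h1, 0.0)
    r1 = np.mean(0.5 * np.log2(1.0 + p1 * h1 / (noise_var + interference_on_1)))
    r2 = np.mean(0.5 * np.log2(1.0 + p2 * h2 / (noise_var + interference_on_2)))
    return r1, r2, mu[0] * r1 + mu[1] * r2
```

For fixed power policies, changing the mask moves rate from one user to the other while leaving the per-state sum rate unchanged, which is why the decoding order is a natural lever for a base-station with unequal weights $\mu_1 \ne \mu_2$.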

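We first study the properties of the low level game. The Nash equilibrium under a fixed base-station
strategy $D_1$ is a power control pair $(\mathcal{P}_1^*, \mathcal{P}_2^*)$ that satisfies

$$\begin{aligned}
R_1(D_1, \mathcal{P}_1^*, \mathcal{P}_2^*) &\ge R_1(D_1, \mathcal{P}_1', \mathcal{P}_2^*), \quad \forall \mathcal{P}_1' \in \mathcal{F}_1, \\
R_2(D_1, \mathcal{P}_1^*, \mathcal{P}_2^*) &\ge R_2(D_1, \mathcal{P}_1^*, \mathcal{P}_2'), \quad \forall \mathcal{P}_2' \in \mathcal{F}_2.
\end{aligned}$$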


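For any given power control policy $\mathcal{P}_2$, the optimal power control policy of user 1 is the solution to the
following optimization problem

$$\begin{aligned}
\max_{\mathcal{P}_1} \quad & R_1(D_1, \mathcal{P}_1, \mathcal{P}_2) = \iint \frac{1}{2}\log_2\left(1 + \frac{\mathcal{P}_1(h_1, h_2)h_1}{\sigma^2 + \mathcal{P}_2(h_1, h_2)h_2 I_{\{(h_1, h_2) \in D_1\}}}\right) f(h_1, h_2)\,dh_1\,dh_2, \\
\text{s.t.} \quad & \iint \mathcal{P}_1(h_1, h_2) f(h_1, h_2)\,dh_1\,dh_2 \le \bar{P}_1, \\
& \mathcal{P}_1(h_1, h_2) \ge 0.
\end{aligned} \tag{19}$$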


The optimal power control policy of user 2 is also the solution to a similar optimization problem for
any power control policy of user 1. For a given $D_1$, the solution set for this low level game is written
as $S(D_1) = \{(\mathcal{P}_1, \mathcal{P}_2) : (\mathcal{P}_1, \mathcal{P}_2)$ is a Nash equilibrium of the low level game$\}$. The following result
characterizes the pure-strategy Nash equilibria of our low level game. The algorithm developed in the
proof is reminiscent of the iterative algorithm in [3,12].

Theorem 2 For any strategy $D_1$ of the base-station, and any channel distribution, there exist Nash
equilibria for the low level distributed power/rate control game.

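Proof: At the Nash equilibrium, no user can benefit by deviating unilaterally. Suppose $\mathcal{P}_2(h_1, h_2)$ is
given; user 1's strategy is then the solution to (19), which is still the water-filling solution

$$\mathcal{P}_1(h_1, h_2) = \left(\lambda_1 - \frac{\sigma^2}{h_1} - \frac{\mathcal{P}_2(h_1, h_2)h_2 I_{\{(h_1, h_2) \in D_1\}}}{h_1}\right)^+, \tag{20}$$

where $\lambda_1$ is the power level chosen to satisfy the power constraint of user 1 with equality. For the same
reason, if we fix $\mathcal{P}_1(h_1, h_2)$, the optimal response of user 2 is also water-filling over the sum of the
interference from user 1 and the background noise, which is

$$\mathcal{P}_2(h_1, h_2) = \left(\lambda_2 - \frac{\sigma^2}{h_2} - \frac{\mathcal{P}_1(h_1, h_2)h_1 I_{\{(h_1, h_2) \in D_1^c\}}}{h_2}\right)^+. \tag{21}$$

The key of our proof is to establish the existence of a pair $(\lambda_1, \lambda_2)$ that simultaneously satisfies the
two power constraints with equality, and hence, constitutes a Nash equilibrium. If such $(\lambda_1, \lambda_2)$ exists, we
have solutions to the equations (20) and (21). One can easily check that if $(h_1, h_2) \in D_1$,

$$\begin{aligned}
\mathcal{P}_1(h_1, h_2) &= \left(\lambda_1 - \frac{\sigma^2}{h_1} - \frac{\mathcal{P}_2(h_1, h_2)h_2}{h_1}\right)^+, \\
\mathcal{P}_2(h_1, h_2) &= \left(\lambda_2 - \frac{\sigma^2}{h_2}\right)^+.
\end{aligned} \tag{22}$$

Similarly, if $(h_1, h_2) \in D_1^c$,

$$\begin{aligned}
\mathcal{P}_1(h_1, h_2) &= \left(\lambda_1 - \frac{\sigma^2}{h_1}\right)^+, \\
\mathcal{P}_2(h_1, h_2) &= \left(\lambda_2 - \frac{\sigma^2}{h_2} - \frac{\mathcal{P}_1(h_1, h_2)h_1}{h_2}\right)^+.
\end{aligned} \tag{23}$$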


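Thus, if the water-filling level pair $(\lambda_1, \lambda_2)$ exists, it should be the solution to the following pair of
equations:

$$\begin{aligned}
\iint_{D_1} \left(\lambda_1 - \frac{\sigma^2}{h_1} - \left(\frac{\lambda_2 h_2}{h_1} - \frac{\sigma^2}{h_1}\right)^+\right)^+ f(h_1, h_2)\,dh_1\,dh_2 + \iint_{D_1^c} \left(\lambda_1 - \frac{\sigma^2}{h_1}\right)^+ f(h_1, h_2)\,dh_1\,dh_2 &= \bar{P}_1, \\
\iint_{D_1} \left(\lambda_2 - \frac{\sigma^2}{h_2}\right)^+ f(h_1, h_2)\,dh_1\,dh_2 + \iint_{D_1^c} \left(\lambda_2 - \frac{\sigma^2}{h_2} - \left(\frac{\lambda_1 h_1}{h_2} - \frac{\sigma^2}{h_2}\right)^+\right)^+ f(h_1, h_2)\,dh_1\,dh_2 &= \bar{P}_2.
\end{aligned} \tag{24}$$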

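Before proceeding further, we first observe the following. If there are two pairs $(\lambda_1', \lambda_2')$ and $(\lambda_1, \lambda_2)$,
where $\lambda_1' > \lambda_1$, $\lambda_2' = \lambda_2$, then we have $P_1(\lambda_1', \lambda_2') \ge P_1(\lambda_1, \lambda_2)$ and $P_2(\lambda_1', \lambda_2') \le P_2(\lambda_1, \lambda_2)$³. One can
easily verify this by observing that $\mathcal{P}_1(h_1, h_2)$ is a non-decreasing function of $\lambda_1$ and a non-increasing
function of $\lambda_2$. At the same time, $\mathcal{P}_2(h_1, h_2)$ is a non-increasing function of $\lambda_1$ and a non-decreasing
function of $\lambda_2$. Based on these observations, we have the following iterative method to solve (24). Set
$\lambda_1(1) = 0$, $\lambda_2(1) = 0$, then fix $\lambda_2$ and increase $\lambda_1$ until $P_1(\lambda_1, \lambda_2(1)) = \bar{P}_1$. This can be done by solving
the following equation:

$$\iint_{D_1} \left(\lambda_1 - \frac{\sigma^2}{h_1} - \left(\frac{\lambda_2(1) h_2}{h_1} - \frac{\sigma^2}{h_1}\right)^+\right)^+ f(h_1, h_2)\,dh_1\,dh_2 + \iint_{D_1^c} \left(\lambda_1 - \frac{\sigma^2}{h_1}\right)^+ f(h_1, h_2)\,dh_1\,dh_2 = \bar{P}_1. \tag{25}$$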


Let $\lambda_1(2)$ represent the solution to this equation. At this time, we will have $P_2(\lambda_1(2), \lambda_2(1)) \le \bar{P}_2$. Then
we can increase $\lambda_2(1)$ to $\lambda_2(2)$ such that $P_2(\lambda_1(2), \lambda_2(2)) = \bar{P}_2$. After this step, $P_1(\lambda_1(2), \lambda_2(2)) \le \bar{P}_1$,
thus we can increase $\lambda_1$ again. Through this process, we can get non-decreasing sequences $\lambda_1(n), \lambda_2(n)$,
and $P_1(\lambda_1(n), \lambda_2(n)) \to \bar{P}_1$, $P_2(\lambda_1(n), \lambda_2(n)) \to \bar{P}_2$. Since $\bar{P}_1, \bar{P}_2$ are finite, $\lambda_1(n), \lambda_2(n)$ are
non-decreasing sequences with upper bounds. Then there exist constants $\lambda_1^*, \lambda_2^*$ such that:

$$\lim_{n \to \infty} \lambda_1(n) = \lambda_1^*, \quad P_1(\lambda_1^*, \lambda_2^*) = \bar{P}_1, \tag{26}$$

$$\lim_{n \to \infty} \lambda_2(n) = \lambda_2^*, \quad P_2(\lambda_1^*, \lambda_2^*) = \bar{P}_2. \tag{27}$$

This pair $(\lambda_1^*, \lambda_2^*)$ is therefore a Nash equilibrium of our power allocation game. □

³ Here $P_i(\lambda_1, \lambda_2)$ refers to the average power of user $i$ when the users do water-filling according to the water levels $(\lambda_1, \lambda_2)$.

Theorem 2 only establishes the existence of a Nash equilibrium; it says nothing about the uniqueness
of this equilibrium. To prove uniqueness, one is typically forced to find a contraction mapping whose
fixed point is the Nash equilibrium. In [12,13], the authors apply this method to the interference game and
find that uniqueness requires very restrictive conditions. Fortunately, we are able to prove uniqueness in
our setup by using the concept of admissible Nash equilibrium (Definition 3.3 of [19]).

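Definition 2 A Nash equilibrium strategy pair $(\mathcal{P}_1^*, \mathcal{P}_2^*)$ is said to be admissible if there exists no other
Nash equilibrium strategy pair $(\mathcal{P}_1', \mathcal{P}_2')$ such that $R_1(D_1, \mathcal{P}_1', \mathcal{P}_2') \ge R_1(D_1, \mathcal{P}_1^*, \mathcal{P}_2^*)$, $R_2(D_1, \mathcal{P}_1', \mathcal{P}_2') \ge
R_2(D_1, \mathcal{P}_1^*, \mathcal{P}_2^*)$ and at least one of these inequalities is strict.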


Intuitively, this notion allows for eliminating Nash equilibria which are dominated by other equilibrium
points. One would expect the rationality of the players to steer them away from such dominated equilibria,
and hence, they will ultimately settle in one of the admissible points. This approach allows for modifying
the solution set for our low level game to only include admissible Nash equilibria $S^*(D_1) = \{(\mathcal{P}_1, \mathcal{P}_2) :
(\mathcal{P}_1, \mathcal{P}_2)$ is an admissible Nash equilibrium of the low level game$\}$. The following result establishes the
existence of a single admissible Nash equilibrium in this set (for any choice of $D_1$).


Theorem 3 For any strategy $D_1$ of the base-station, and any channel distribution function, there exists a
single admissible Nash equilibrium for the low level power/rate allocation game (i.e., for any $D_1$, $S^*(D_1)$
is a singleton).


Proof: If $D_1$ is the same as the region given in Section 3.1, then the optimal solution is time-sharing,
and the Nash equilibrium is unique (as established earlier). For other $D_1$, we establish uniqueness of the
admissible Nash equilibrium by contradiction.

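We let $(\lambda_1', \lambda_2')$ and $(\lambda_1^*, \lambda_2^*)$ be the two pairs of water-levels corresponding to equilibria. Then, by
definition, the two average power constraints are satisfied with equality with these two pairs of
water-levels, that is $P_1(\lambda_1', \lambda_2') = \bar{P}_1$, $P_2(\lambda_1', \lambda_2') = \bar{P}_2$, $P_1(\lambda_1^*, \lambda_2^*) = \bar{P}_1$, $P_2(\lambda_1^*, \lambda_2^*) = \bar{P}_2$. Noting that we are
not at a time sharing point, we claim:

1. If $\lambda_1^* = \lambda_1'$, we have $\lambda_2^* = \lambda_2'$. If not, we will have $P_1(\lambda_1', \lambda_2') > \bar{P}_1$, $P_2(\lambda_1', \lambda_2') < \bar{P}_2$ when $\lambda_2^* > \lambda_2'$
and $P_1(\lambda_1', \lambda_2') < \bar{P}_1$, $P_2(\lambda_1', \lambda_2') > \bar{P}_2$ when $\lambda_2^* < \lambda_2'$. Thus we come to a contradiction.

2. If $\lambda_1^* < \lambda_1'$, we have $\lambda_2^* < \lambda_2'$. If not, we will have $P_1(\lambda_1', \lambda_2') > \bar{P}_1$, $P_2(\lambda_1', \lambda_2') < \bar{P}_2$ when $\lambda_2^* \ge \lambda_2'$.
Thus we come to a contradiction.

3. If $\lambda_1^* > \lambda_1'$, we have $\lambda_2^* > \lambda_2'$. If not, we will have $P_1(\lambda_1', \lambda_2') < \bar{P}_1$, $P_2(\lambda_1', \lambda_2') > \bar{P}_2$ when $\lambda_2^* \le \lambda_2'$.
Thus we come to a contradiction.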

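The two water-level pairs, therefore, have a strict order. We can define the relationship $\prec$ for the
water-level pairs and say $(\lambda_1^*, \lambda_2^*) \prec (\lambda_1', \lambda_2')$ if $\lambda_1^* < \lambda_1'$ and $\lambda_2^* < \lambda_2'$. Suppose $(\lambda_1^*, \lambda_2^*) \prec (\lambda_1', \lambda_2')$;
we claim that $R_1(D_1, \mathcal{P}_1^*, \mathcal{P}_2^*) > R_1(D_1, \mathcal{P}_1', \mathcal{P}_2')$ and $R_2(D_1, \mathcal{P}_1^*, \mathcal{P}_2^*) > R_2(D_1, \mathcal{P}_1', \mathcal{P}_2')$. Without loss of
generality, we only need to prove the first part. To show this, we can see that the sum of the interference
from user 2 and the background noise is

$$N_1(\lambda_2) = \sigma^2,$$

if $(h_1, h_2) \in D_1^c$, and

$$N_1(\lambda_2) = \sigma^2 + (\lambda_2 h_2 - \sigma^2)^+, \tag{28}$$

if $(h_1, h_2) \in D_1$.

Since our solution is not time sharing, we can see that $N_1(\lambda_2)$ is an increasing function of $\lambda_2$, so a smaller
water level for user 2 means less interference for user 1. Thus
$\lambda_2^* < \lambda_2'$ implies that $R_1(D_1, \mathcal{P}_1^*, \mathcal{P}_2^*) > R_1(D_1, \mathcal{P}_1', \mathcal{P}_2')$ and our claim is true.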

This claim means that the achievable utility pairs also have strict order, i.e., the smaller the water-filling 
pair, the larger the utility pair. With this strict order relationship among the achievable utilities at the Nash 
equilibria, the unique admissible Nash equilibrium is achieved with the minimum water-level pair. This 
completes the proof. □ 

An explicit approach for achieving the unique admissible equilibrium in our game is for all the users
to follow the iterative algorithm used in the proof of Theorem 2 and agree off-line on the convention of
starting the iteration with $\lambda_1(1) = \lambda_2(1) = 0$. This agreement is clearly in the best interest of the two
users, and hence, is consistent with the selfish behavior assumption.

Now, we turn our attention to characterizing efficient base-station strategies. In the following we use
$\mathcal{P}_{iD_1}$ to refer to the unique power control policy of each user, under strategy $D_1$, at the admissible Nash
equilibrium. Here, we borrow the following definition from [19].

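Definition 3 A strategy $D_1^*$ is called a Stackelberg equilibrium strategy for a given $(\mu_1, \mu_2)$, if

$$\begin{aligned}
R^* &= \mu_1 R_1(D_1^*, \mathcal{P}_{1D_1^*}, \mathcal{P}_{2D_1^*}) + \mu_2 R_2(D_1^*, \mathcal{P}_{1D_1^*}, \mathcal{P}_{2D_1^*}) \\
&\ge \mu_1 R_1(D_1, \mathcal{P}_{1D_1}, \mathcal{P}_{2D_1}) + \mu_2 R_2(D_1, \mathcal{P}_{1D_1}, \mathcal{P}_{2D_1}),
\end{aligned} \tag{29}$$

for all $D_1$. Moreover, for any $\epsilon > 0$, a strategy $D_{1\epsilon}^*$ is called an $\epsilon$-Stackelberg strategy if

$$\mu_1 R_1(D_{1\epsilon}^*, \mathcal{P}_{1D_{1\epsilon}^*}, \mathcal{P}_{2D_{1\epsilon}^*}) + \mu_2 R_2(D_{1\epsilon}^*, \mathcal{P}_{1D_{1\epsilon}^*}, \mathcal{P}_{2D_{1\epsilon}^*}) \ge R^* - \epsilon. \tag{30}$$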
Corollary 1 For every pair $(\mu_1, \mu_2)$, $0 < \mu_1 < \infty$, $0 < \mu_2 < \infty$, an $\epsilon$-Stackelberg strategy exists.

Proof: Based on Property 4.2 of [19], the only thing we need to prove is that $R^*$ is bounded. Define $R_i^{\circ}$
as the average rate the $i$th user can get when the other user is absent, then

$$R^* = \mu_1 R_1(D_1^*, \mathcal{P}_{1D_1^*}, \mathcal{P}_{2D_1^*}) + \mu_2 R_2(D_1^*, \mathcal{P}_{1D_1^*}, \mathcal{P}_{2D_1^*}) \le \mu_1 R_1^{\circ} + \mu_2 R_2^{\circ}. \tag{31}$$

This completes the proof. □

Combining Theorem 3 and Corollary 1, we see that the proposed Stackelberg game setup has a very
desirable structure. For any given vector $\boldsymbol{\mu}$, the existence of equilibrium is guaranteed and the optimal
policy for every rational multiple access user in the low level game is unique. Therefore, the users will have
no difficulty in deciding the power and rate levels in a distributed way. The following result characterizes
the achievable performance of the proposed Stackelberg game.

Theorem 4 Let $\Omega_s = \bigcup_{D_1} \{(R_1(D_1, \mathcal{P}_{1D_1}, \mathcal{P}_{2D_1}), R_2(D_1, \mathcal{P}_{1D_1}, \mathcal{P}_{2D_1}))\}$. Then, $\Omega_s$ includes the three boundary
points $CR_1$, $CR_2$, $SP$ of the capacity region $\Omega_c$. However, $\Omega_s$ does not include any other boundary
points of $\Omega_c$.

Proof : It is easy to verify that $CR_1$ can be achieved by setting $D_1 = \emptyset$, which means the base-station will
always decode user 2's signal first. The corresponding policy for user 1 is to water-fill over the background
noise, while the optimal policy for user 2 is also water-filling but over the sum of the interference from user
1 and the background noise. This is exactly the same as the centralized policy that achieves the boundary
point $CR_1$. Similarly $CR_2$ can be achieved by setting $D_1^c = \emptyset$, and $SP$ can be achieved by setting $D_1$ as
the same region given in Section 3.1.

Now suppose that $\Omega_s$ includes another boundary point $(R_{1b}, R_{2b})$. Without loss of generality, suppose
that at this point $\mu_1 > \mu_2$ and the corresponding optimal central policy is $\mathcal{P}_b$, $\mathcal{R}_b$. The partition region that
achieves this point is given by $D_b$. The corresponding admissible power control pair is $\mathcal{P}_{1D_b}$, $\mathcal{P}_{2D_b}$. It was
shown in [3] that the power control policy that achieves any boundary point is unique. Thus if the partition
$D_b$ achieves this point, at any fading state $(h_1, h_2)$, we have

$$\mathcal{P}_{1D_b}(h_1, h_2) = \mathcal{P}_{1,b}(h_1, h_2), \qquad \mathcal{P}_{2D_b}(h_1, h_2) = \mathcal{P}_{2,b}(h_1, h_2). \tag{32}$$

Then at any fading state, the capacity region pentagons formed by these two policies are the same, as
shown in Figure 3. For every fading state, the optimal rate control policy $\mathcal{R}_b$ corresponds to the
corner point X1. For the distributed power control, on the other hand, when $(h_1, h_2) \in D_b$ the operating point is X2,
and when $(h_1, h_2) \in D_b^c$ the operating point is X1. Thus

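$$\begin{aligned}
R_1(D_b, \mathcal{P}_{1D_b}, \mathcal{P}_{2D_b}) &= E_{\{\mathbf{h} \in D_b\}}[R_{1,X2}(\mathbf{h})] + E_{\{\mathbf{h} \in D_b^c\}}[R_{1,X1}(\mathbf{h})] \\
&< E_{\{\mathbf{h} \in D_b\}}[R_{1,X1}(\mathbf{h})] + E_{\{\mathbf{h} \in D_b^c\}}[R_{1,X1}(\mathbf{h})] = R_{1b},
\end{aligned} \tag{33}$$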

Figure 3: The capacity region of the Gaussian multiple access channel with fixed channel gains $(h_1, h_2)$.


which is a contradiction. This shows the non-existence of a partition $D_b$ that achieves any other boundary point of the
capacity region $\Omega_c$. □
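
The pentagon of Figure 3 and its two corner points can be illustrated numerically. The following Python sketch is not part of the original analysis: the function name `mac_pentagon_corners`, the unit noise variance, and the sample gains and powers are assumptions chosen for illustration. For fixed gains it computes the rate pair obtained under each of the two successive decoding orders.

```python
import math

def mac_pentagon_corners(h1, h2, p1, p2, sigma2=1.0):
    """Corner points of the two-user Gaussian MAC pentagon for fixed gains.

    Returns two (R1, R2) pairs in bits per real dimension:
    the first decodes user 2 first (user 1 sees no interference),
    the second decodes user 1 first (user 2 sees no interference).
    """
    def rate(signal, noise):
        return 0.5 * math.log2(1.0 + signal / noise)

    s1, s2 = h1 * p1, h2 * p2  # received signal powers
    user1_last = (rate(s1, sigma2), rate(s2, sigma2 + s1))
    user2_last = (rate(s1, sigma2 + s2), rate(s2, sigma2))
    return user1_last, user2_last

# Illustrative values (assumed): equal gains and powers, unit noise.
corner_a, corner_b = mac_pentagon_corners(1.0, 1.0, 1.0, 1.0)
sum_rate = 0.5 * math.log2(1.0 + 2.0)
```

Both corners carry the same sum rate and differ only in how it is split, which is why a state-dependent decoding order that sometimes lands on the "wrong" corner strictly lowers the average rate of one user. Which corner the proof labels X1 or X2 depends on the ordering of $\mu_1$ and $\mu_2$.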


Theorem 4 shows that the introduction of the base-station as a leader of the game enlarges the achievable
rate region (as compared to the Nash game discussed earlier), but this approach falls short of achieving
the whole capacity region. Figure 4 compares the capacity region with the Stackelberg achievable rate region
assuming the following simple base-station strategy: when $h_1 < \alpha h_2$ the base-station decodes user
1 first and when $h_1 > \alpha h_2$ the base-station decodes user 2 first. Under this strategy, the rates at the
Nash equilibrium are:


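$$\begin{aligned}
R_1(\alpha) = {} & \int_0^\infty \int_{\frac{\sigma^2}{\lambda_1} + \frac{(\lambda_2 h_2 - \sigma^2)^+}{\lambda_1}}^{\alpha h_2} \frac{1}{2} \log_2\left(1 + \frac{\lambda_1 h_1 - \sigma^2 - (\lambda_2 h_2 - \sigma^2)^+}{\sigma^2 + (\lambda_2 h_2 - \sigma^2)^+}\right) f(h_1, h_2)\, dh_1\, dh_2 \\
& + \int_0^\infty \int_{\max\{\alpha h_2, \frac{\sigma^2}{\lambda_1}\}}^{\infty} \frac{1}{2} \log_2\left(1 + \frac{\lambda_1 h_1 - \sigma^2}{\sigma^2}\right) f(h_1, h_2)\, dh_1\, dh_2,
\end{aligned} \tag{34}$$

$$\begin{aligned}
R_2(\alpha) = {} & \int_0^\infty \int_{\frac{\sigma^2}{\lambda_2} + \frac{(\lambda_1 h_1 - \sigma^2)^+}{\lambda_2}}^{\frac{h_1}{\alpha}} \frac{1}{2} \log_2\left(1 + \frac{\lambda_2 h_2 - \sigma^2 - (\lambda_1 h_1 - \sigma^2)^+}{\sigma^2 + (\lambda_1 h_1 - \sigma^2)^+}\right) f(h_1, h_2)\, dh_2\, dh_1 \\
& + \int_0^\infty \int_{\max\{\frac{h_1}{\alpha}, \frac{\sigma^2}{\lambda_2}\}}^{\infty} \frac{1}{2} \log_2\left(1 + \frac{\lambda_2 h_2 - \sigma^2}{\sigma^2}\right) f(h_1, h_2)\, dh_2\, dh_1,
\end{aligned} \tag{35}$$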
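
where $\lambda_1$, $\lambda_2$ are the solutions to the following equations:

$$\begin{aligned}
& \int_0^\infty \int_{\frac{\sigma^2}{\lambda_1} + \frac{(\lambda_2 h_2 - \sigma^2)^+}{\lambda_1}}^{\alpha h_2} \left(\lambda_1 - \frac{\sigma^2}{h_1} - \frac{(\lambda_2 h_2 - \sigma^2)^+}{h_1}\right) f(h_1, h_2)\, dh_1\, dh_2 \\
& \qquad + \int_0^\infty \int_{\max\{\alpha h_2, \frac{\sigma^2}{\lambda_1}\}}^{\infty} \left(\lambda_1 - \frac{\sigma^2}{h_1}\right) f(h_1, h_2)\, dh_1\, dh_2 = \bar{P}_1, \\
& \int_0^\infty \int_{\frac{\sigma^2}{\lambda_2} + \frac{(\lambda_1 h_1 - \sigma^2)^+}{\lambda_2}}^{\frac{h_1}{\alpha}} \left(\lambda_2 - \frac{\sigma^2}{h_2} - \frac{(\lambda_1 h_1 - \sigma^2)^+}{h_2}\right) f(h_1, h_2)\, dh_2\, dh_1 \\
& \qquad + \int_0^\infty \int_{\max\{\frac{h_1}{\alpha}, \frac{\sigma^2}{\lambda_2}\}}^{\infty} \left(\lambda_2 - \frac{\sigma^2}{h_2}\right) f(h_1, h_2)\, dh_2\, dh_1 = \bar{P}_2.
\end{aligned} \tag{36}$$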



Figure 4: The equilibria points of the Stackelberg power/rate allocation game. 

It is easy to verify that $CR_1$ is achieved by setting $\alpha = 0$, $CR_2$ is achieved by setting $\alpha = \infty$, and $SP$
is achieved by setting $\alpha = \lambda_2/\lambda_1$, where $\lambda_1$, $\lambda_2$ are the water-filling levels given in Section 3.1. One can
also prove the following statement.

Corollary 2 For the base-station that adopts the simple region partition strategy, there always exists
a Stackelberg equilibrium solution for any pair $(\mu_1, \mu_2)$, if the channel gains are bounded and satisfy
$\min(h_1) > 0$, $\min(h_2) > 0$.

Proof : Since $(h_1, h_2)$ are bounded, and $\min(h_1) > 0$, $\min(h_2) > 0$, we have $\alpha \in [\min(h_1)/\max(h_2),
\max(h_1)/\min(h_2)]$, which is a compact set. For every $\alpha$, we have proved in Theorem 3 that $S^*(\alpha)$ is a singleton;
thus based on [19], for any pair $(\mu_1, \mu_2)$, there exists a Stackelberg equilibrium solution. □
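
The simple threshold strategy can be made concrete for a single fading state. The Python sketch below is only an illustration under stated assumptions: the function name `threshold_strategy_state`, the unit noise variance, and the sample water levels are hypothetical, and the water levels are taken as given rather than solved from the average power constraints. The user decoded last water-fills over the background noise, while the user decoded first water-fills over the noise plus the other user's received power.

```python
import math

def threshold_strategy_state(h1, h2, lam1, lam2, alpha, sigma2=1.0):
    """Powers and rates in one fading state under the threshold decoding rule.

    Assumes h1 > 0 and h2 > 0. Rates are in bits per real dimension.
    """
    if h1 < alpha * h2:                       # user 1 decoded first
        p2 = max(lam2 - sigma2 / h2, 0.0)
        p1 = max(lam1 - (sigma2 + h2 * p2) / h1, 0.0)
        r1 = 0.5 * math.log2(1.0 + h1 * p1 / (sigma2 + h2 * p2))
        r2 = 0.5 * math.log2(1.0 + h2 * p2 / sigma2)
    else:                                     # user 2 decoded first
        p1 = max(lam1 - sigma2 / h1, 0.0)
        p2 = max(lam2 - (sigma2 + h1 * p1) / h2, 0.0)
        r1 = 0.5 * math.log2(1.0 + h1 * p1 / sigma2)
        r2 = 0.5 * math.log2(1.0 + h2 * p2 / (sigma2 + h1 * p1))
    return p1, p2, r1, r2

# Same state and water levels (assumed values), two different thresholds.
first = threshold_strategy_state(1.0, 1.0, 3.0, 3.0, alpha=2.0)
second = threshold_strategy_state(1.0, 1.0, 3.0, 3.0, alpha=0.5)
```

In this symmetric example the threshold alone decides which user transmits: with the larger threshold user 2 is decoded last and takes the state, and with the smaller threshold the roles are reversed. Averaging such per-state rates over the fading distribution gives the integrals for $R_1(\alpha)$ and $R_2(\alpha)$.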

3.3 Repeated Game Formulation

The inability of our Stackelberg game to achieve all the boundary points of the capacity region can be
attributed to the structural difference between our successive decoding strategy and the optimal decoding
strategy characterized in [3]. In particular, the optimal decoding strategy will always decode user 1 first
(i.e., for all channel states) if $\mu_1 < \mu_2$, whereas in our formulation the decoding order is a function of the
channel state. Unfortunately, if we adopt any static decoding order, the game will always settle at one of
the corner points of the capacity region as argued in the previous section. To solve this problem, we pursue
our last resort of replacing the static game formulation with a dynamic one.

The static formulation assumes that the players interact with each other only once. This assumption
models the case where the topology of the network changes quickly. In more slowly varying environments,
a dynamic game formulation seems more appropriate. Specifically, we call a game where the
players interact for $T > 1$ instances a dynamic game⁴. An example of a dynamic game is the repeated
game where the same static game is played many times. Obviously, the users can play this game by repeating
the same static strategy [18]. But the advantage of the repeated game framework is that the players
can do better than just repeating the same static strategy. The idea is that, since the players will interact
with each other many times, they can learn each other's strategies, which may allow them to cooperate to
obtain higher payoffs. In this case, the players can start cooperating and if one player deviates from the
cooperation phase, the other players will adjust their strategies to punish the deviating player. The punishment
threat is credible only if the deviating player achieves a lower payoff under punishment as compared
with the cooperating phase. Under these circumstances, the users will have no desire to deviate from the
cooperation phase, thus all the users can achieve higher utilities as compared to the static scenario.

In the repeated game, the utility of each player can be defined as a discounted sum of the payoff
achieved in each stage. We refer to the discount factor by $\delta$, where $0 < \delta < 1$. The larger $\delta$ is, the more
patient the player is. In the proof of the following theorem, we use a generalized version of a result due to
Aumann and Shapley [18], [27] and define the payoff of the repeated game as the time-average of the payoff
at each stage.
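
The two payoff definitions mentioned above can be compared on a short payoff sequence. This Python sketch is illustrative only: the function names are hypothetical, and the convention that the first stage is weighted by $\delta^0$ is an assumption, since the text does not fix the indexing.

```python
def discounted_utility(stage_payoffs, delta):
    """Discounted sum of stage payoffs, with 0 < delta < 1 (first stage weighted by delta**0)."""
    return sum((delta ** t) * u for t, u in enumerate(stage_payoffs))

def time_average_utility(stage_payoffs):
    """Time average of the stage payoffs over the observed horizon."""
    return sum(stage_payoffs) / len(stage_payoffs)

payoffs = [1.0, 1.0, 1.0]  # assumed constant stage payoff
discounted = discounted_utility(payoffs, 0.5)
averaged = time_average_utility(payoffs)
```

Under the time-average criterion any finite number of low-payoff stages has no effect in the limit, which is the property the proof of the following theorem relies on.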

Theorem 5 As $T \to \infty$, all the boundary points of the capacity region are achievable under the repeated
game setup with the base-station as the game leader. Moreover, the corresponding equilibria are subgame
perfect.

Proof: In order to prove our claims, we need to construct a subgame perfect strategy that achieves every
boundary point. Consider the following strategy: the base-station announces its rate award vector $\boldsymbol{\mu}$, then
the game proceeds in the following way:

1. At $t = 1$, each user uses the optimal centralized power control policy $\mathcal{P}_c$ and rate control policy $\mathcal{R}_c$ that
maximize $\boldsymbol{\mu} \cdot \mathbf{R} = \mu_1 R_1 + \mu_2 R_2$. Under this point, user $i$ gets a rate $R_i$.

4 We note that every game stage is assumed long enough to justify invoking the ergodic assumption within every stage. 

2. If user 1 deviates from the centralized control policy at some stage $t$, then the base-station will punish
user 1 by moving to the corner point $CR_2$ for $T_1$ periods (i.e., decoding user 1 first for $T_1$ stages).
The parameter $T_1$ is chosen such that

$$R_{1,CR_1} + \sum_{i=1}^{T_1} R_{1,CR_2} < \sum_{i=1}^{T_1} R_1. \tag{37}$$

After $T_1$ periods, the players return to the cooperative phase. If user 2 deviates, the base-station can
also punish it for $T_2$ stages, which can be chosen in a similar way, by moving to the corner point
$CR_1$.

The conditions on $T_i$ ensure that any gain obtained from deviating is removed in the punishment phase,
so no sequence of a finite or infinite number of deviations can increase user $i$'s payoff. Moreover, although
it is costly for the base-station to carry out the punishment, any finite number of such losses is costless in
the long run. This proves the subgame perfection of the strategy. □
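
The choice of the punishment length can be sketched in a few lines of Python. This is an illustration under assumptions rather than part of the proof: the function name and the numerical rates are hypothetical, and the condition coded below reads the requirement on $T_1$ as "the best one-stage deviation rate plus $T_1$ stages at the punished rate is strictly less than $T_1$ stages at the cooperative rate".

```python
def min_punishment_length(rate_best_deviation, rate_punished, rate_cooperative,
                          max_stages=10**6):
    """Smallest T with rate_best_deviation + T*rate_punished < T*rate_cooperative."""
    if rate_cooperative <= rate_punished:
        raise ValueError("punishment must be worse than cooperation")
    for stages in range(1, max_stages + 1):
        if rate_best_deviation + stages * rate_punished < stages * rate_cooperative:
            return stages
    raise ValueError("no punishment length found within max_stages")

# Assumed rates: best deviation 2.0, punished 0.5, cooperative 1.5 (bits per use).
T1 = min_punishment_length(2.0, 0.5, 1.5)
```

A finite punishment length exists whenever the cooperative rate strictly exceeds the rate at the punishing corner point, which is what makes the threat credible.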


3.4 Arbitrary Number of Users 

All our results generalize naturally to the $N$ user channel except for Theorem 3. The arguments used in
the proof do not carry over for $N \ge 3$, and hence, we cannot guarantee the uniqueness of the admissible
Nash equilibrium. However, if the multiple-access users choose the Nash equilibrium corresponding to
the iterative algorithm used in the proof started from $\boldsymbol{\lambda} = \mathbf{0}$, then the rest of our results in Section 3.2 hold. The
base-station can announce this initial condition in the first stage of the Stackelberg game. All users will
be forced to follow this strategy since any deviation can result in the catastrophic event of unsuccessful
decoding.

For the sake of completeness, we detail in this section the generalization of our Nash game. The 
other scenarios follow virtually the same lines, and hence, are omitted for brevity. We first restate our 
assumption that all the users are informed a-priori of all the CSI. This is exactly the same assumption 
used in [4-9], and corresponds to a game with complete information. In the Nash formulation, every user 
treats the signals from other users as noise. The optimal power control policy of each user is to water-fill 
over the sum of the interference and the background noise, i.e., 


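$$\mathcal{P}_i(\mathbf{h}) = \left(\lambda_i - \frac{\sigma^2 + \sum_{j=1, j \ne i}^{N} h_j \mathcal{P}_j(\mathbf{h})}{h_i}\right)^+. \tag{38}$$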
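
For a finite set of fading states, the water level that meets the average power constraint with equality can be found by bisection. The Python sketch below is a minimal illustration, not the authors' procedure: the function name `water_level`, the discrete fading model with probabilities `probs`, and the numerical values are assumptions. Each entry of `effective_noise` stands for the interference-plus-noise term divided by the user's own gain in that state.

```python
def water_level(effective_noise, probs, avg_power, iters=100):
    """Bisection for lam with sum_k probs[k] * max(lam - effective_noise[k], 0) = avg_power.

    Assumes all probabilities are positive and sum to one.
    """
    def spent(lam):
        return sum(q * max(lam - n, 0.0) for q, n in zip(probs, effective_noise))

    # At hi the average power spent is at least avg_power, so the root lies in [lo, hi].
    lo, hi = 0.0, max(effective_noise) + avg_power / min(probs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if spent(mid) < avg_power:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    powers = [max(lam - n, 0.0) for n in effective_noise]
    return lam, powers

# Two equally likely states with effective noise 1 and 3, average power 1 (assumed values).
lam, powers = water_level([1.0, 3.0], [0.5, 0.5], avg_power=1.0)
```

In this example all the power is spent in the better state, which is the opportunistic behavior the water-filling rule produces.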


Each user will adjust its water level depending on the levels of the other users. At the Nash equilibrium
points the water levels $\lambda_i$, $i = 1, \ldots, N$, satisfy all power constraints with equality. In order to show
that the only Nash equilibrium of this game is the maximum sum-rate point, we generalize the proof of
Theorem 1. In particular, we show that at the equilibrium only one user will transmit at any fading state;
it is then easy to verify that the power control policy of each user at the equilibrium is exactly the same as
the corresponding central policy for the point $SP$. Without loss of generality, suppose that users 1 to $M$
are transmitting simultaneously at certain fading states, then for each transmitting user, we have


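$$\begin{aligned}
\mathcal{P}_1 + \frac{\sigma^2 + \sum_{j=2}^{M} h_j \mathcal{P}_j}{h_1} &= \lambda_1, \\
&\ \ \vdots \\
\mathcal{P}_i + \frac{\sigma^2 + \sum_{j=1, j \ne i}^{M} h_j \mathcal{P}_j}{h_i} &= \lambda_i, \\
&\ \ \vdots \\
\mathcal{P}_M + \frac{\sigma^2 + \sum_{j=1}^{M-1} h_j \mathcal{P}_j}{h_M} &= \lambda_M.
\end{aligned} \tag{39}$$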


These conditions imply that $\lambda_i h_i = \lambda_j h_j$, $\forall i, j = 1, \ldots, M$. With continuous probability density functions,
this happens with probability zero. Then with probability one, at any fading state only one user will
transmit. If user $i$ transmits, the sum of background noise and the signal of user $i$ should be larger than the
water level of user $j$, and hence, $h_i$ should satisfy

$$\frac{\sigma^2}{h_j} + \frac{h_i}{h_j}\left(\lambda_i - \frac{\sigma^2}{h_i}\right) > \lambda_j, \quad \forall j \ne i. \tag{40}$$
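
The selection rule implied by this condition is simple: the inequality reduces to $\lambda_i h_i > \lambda_j h_j$, so the transmitting user is the one with the largest product of water level and channel gain. The Python sketch below illustrates this with a hypothetical function name and assumed numerical values; user indices are zero-based in the code.

```python
def active_user_and_power(gains, levels, sigma2=1.0):
    """Index of the user with the largest lam_i * h_i and its transmit power.

    Assumes all gains are positive; ties occur with probability zero for
    continuous fading and are broken here in favor of the lowest index.
    """
    scores = [lam * h for lam, h in zip(levels, gains)]
    winner = max(range(len(gains)), key=lambda i: scores[i])
    power = max(levels[winner] - sigma2 / gains[winner], 0.0)
    return winner, power

# Three users in one fading state (assumed values).
winner, power = active_user_and_power(gains=[0.5, 2.0, 1.0], levels=[3.0, 1.0, 1.5])
```

Note that the user with the largest water level does not necessarily win the state; the comparison is between the weighted gains $\lambda_i h_i$.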


4 Vector Channels 

Thus far, we have presented our results for the scalar channel where the base-station is only equipped with
one receive antenna. In this section, we extend our study to the vector multiple access channel where the
base-station is equipped with $N_r$ receive antennas. Our goal is to see if our previous conclusions carry
through or not. Again to simplify the presentation, we focus on the two user scenario. The signal received
at any time $n$ is given by

$$\mathbf{y}(n) = \sum_{i=1}^{2} \mathbf{h}_i(n) x_i(n) + \mathbf{z}(n), \tag{41}$$

where $\mathbf{h}_i(n) = [\sqrt{h_{1i}}, \sqrt{h_{2i}}, \ldots, \sqrt{h_{N_r i}}]^T$ is the $N_r \times 1$ fading vector from user $i$ to the $N_r$ receive
antennas. As before, we assume that the fading processes have a joint continuous distribution with a
bounded density. $\mathbf{z}(n)$ is the Gaussian noise vector at the $N_r$ receive antennas with correlation matrix
$E[\mathbf{z}\mathbf{z}^T] = \sigma^2 \mathbf{I}_{N_r}$.

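Similar to the scalar channel case, we first consider the static Nash formulation where the only players
of the game are the multiple access users. The strategy space of user $i$ is still $\mathcal{T}_i = \{\mathcal{P}_i : E_{\mathbf{H}}[\mathcal{P}_i] \le
\bar{P}_i, \mathcal{P}_i(\mathbf{H}) \ge 0\}$ with $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2]$. The payoff function of user $i$ is still the average achievable rate
$R_i = E_{\mathbf{H}}[\mathcal{R}_i]$. It is easy to see that for any power control strategy $\mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)$ of user 2, the optimal power
control policy of user 1 is the solution to the following optimization problem

$$\begin{aligned}
\max_{\mathcal{P}_1} R_1 = E_{\mathbf{H}}\Big[ & \tfrac{1}{2} \log_2 \det\left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_1\mathbf{h}_1^H + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right) \\
& - \tfrac{1}{2} \log_2 \det\left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right) \Big] \\
\text{s.t. } & \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2) \in \mathcal{T}_1.
\end{aligned} \tag{42}$$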


Given any power control strategy Pi (hi, h 2 ) of user 1, the optimal power control strategy of user 2 is a 
solution to a similar problem. The difference between the vector and scalar channels is highlighted in the 
following result. 


Theorem 6 There exists a unique Nash equilibrium for the distributed power/rate allocation game in the 
vector multiple access channel. At this equilibrium, the power control policy of each user is the same as 
the central policy that achieves the maximum sum-rate point SP. The achieved rates, however, are strictly 
smaller than the rates corresponding to SP. 


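Proof: Given the power control policy $\mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)$, it is easy to see that
$E_{\mathbf{H}}\left[\tfrac{1}{2}\log_2 \det\left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right)\right]$
is a constant, thus the solution to the optimization problem (42) is the same as the
solution to the following optimization problem

$$\begin{aligned}
\max_{\mathcal{P}_1} f(\mathcal{P}_1) = {} & E_{\mathbf{H}}\left[\tfrac{1}{2} \log_2 \det\left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_1\mathbf{h}_1^H + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right)\right] \\
\text{s.t. } & \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2) \in \mathcal{T}_1.
\end{aligned} \tag{43}$$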


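Since $\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_1\mathbf{h}_1^H + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H$ is positive definite, and the $\log_2 \det(\cdot)$ function
is concave on the set of positive definite matrices, the objective function is concave in the
power allocation policy. The constraint set is convex and it is easy to verify that Slater's condition is
satisfied. Hence, there exists a constant $\gamma_1$, such that the solution to (43) is the same as the solution to the
following optimization problem:

$$\max_{\mathcal{P}_1} L_1(\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2), \gamma_1) = E_{\mathbf{H}}\left[\tfrac{1}{2}\log_2 \det\left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_1\mathbf{h}_1^H + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right)\right] - \gamma_1 E_{\mathbf{H}}[\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)]. \tag{44}$$

The KKT necessary and sufficient conditions of this optimization problem are (with the positive constant
factor that arises from differentiating $\tfrac{1}{2}\log_2$ absorbed into $\gamma_1$)

$$\frac{\partial L_1}{\partial \mathcal{P}_1} = \mathbf{h}_1^H \left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_1\mathbf{h}_1^H + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right)^{-1} \mathbf{h}_1 - \gamma_1 = 0, \qquad \gamma_1 > 0. \tag{45}$$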
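
Using the matrix inversion lemma [23]

$$(\mathbf{A} + \mathbf{x}\mathbf{x}^H)^{-1} = \mathbf{A}^{-1} - \frac{\mathbf{A}^{-1}\mathbf{x}\mathbf{x}^H\mathbf{A}^{-1}}{1 + \mathbf{x}^H\mathbf{A}^{-1}\mathbf{x}}, \tag{46}$$

we come to

$$\frac{\partial L_1}{\partial \mathcal{P}_1} = \frac{\mathbf{h}_1^H \left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right)^{-1} \mathbf{h}_1}{1 + \mathbf{h}_1^H \left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right)^{-1} \mathbf{h}_1 \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)} - \gamma_1 = 0, \qquad \gamma_1 > 0. \tag{47}$$

Considering the condition $\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2) \ge 0$, we get

$$\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2) = \left(\lambda_1 - \frac{1}{\mathbf{h}_1^H \left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right)^{-1} \mathbf{h}_1}\right)^+, \tag{48}$$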


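where $\lambda_1 = \frac{1}{\gamma_1}$ is a constant that satisfies the average power constraint of user 1 with equality. Similarly,
given $\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)$, we get the following optimality condition

$$\mathbf{h}_2^H \left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_1\mathbf{h}_1^H + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H\right)^{-1} \mathbf{h}_2 - \gamma_2 = 0, \qquad \gamma_2 > 0. \tag{49}$$

The optimal policy of user 2 is therefore

$$\mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2) = \left(\lambda_2 - \frac{1}{\mathbf{h}_2^H \left(\sigma^2 \mathbf{I}_{N_r} + \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_1\mathbf{h}_1^H\right)^{-1} \mathbf{h}_2}\right)^+, \tag{50}$$

where $\lambda_2$ is the constant that satisfies the average power constraint of user 2 with equality. Applying the
results of [23] to the fading multiple access channel with $N_r$ receive antennas, we know that these conditions are
exactly the optimality conditions for the following optimization problem

$$\begin{aligned}
\max_{\mathcal{P}_1, \mathcal{P}_2} R_{sum}(\mathcal{P}_1, \mathcal{P}_2) &= E_{\mathbf{H}}[\mathcal{R}_1 + \mathcal{R}_2] \\
&= E_{\mathbf{H}}\left[\tfrac{1}{2} \log_2 \det\left(\mathbf{I}_{N_r} + \frac{\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_1\mathbf{h}_1^H + \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2)\mathbf{h}_2\mathbf{h}_2^H}{\sigma^2}\right)\right] \\
\text{s.t. } & \mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2) \in \mathcal{T}_1, \ \mathcal{P}_2(\mathbf{h}_1, \mathbf{h}_2) \in \mathcal{T}_2.
\end{aligned} \tag{51}$$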


One can easily verify that the optimization problem (51) will maximize the sum-rate at the base-station.
This means the optimal policy of each user aiming to maximize its own rate while treating the signal of
the other user as interference is exactly the same as the power control policy that maximizes the sum-rate
at the base-station. A similar observation has been made in the Gaussian multiple access channel in [24].
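
The step from the stationarity condition to the water-filling form can be checked numerically for a small example. The Python sketch below is an illustration under assumptions: it uses real two-dimensional channel vectors, unit noise variance, and hypothetical helper names (`inv2`, `quad`, `covariance`). It computes user 1's effective gain by explicit matrix inversion, applies the water-filling rule, and then evaluates the quadratic form with the full covariance matrix.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def quad(h, m):
    """Quadratic form h^T m h for a real 2-vector h."""
    return sum(h[i] * m[i][j] * h[j] for i in range(2) for j in range(2))

def covariance(sigma2, p1, h1, p2, h2):
    """sigma2*I + p1*h1*h1^T + p2*h2*h2^T for real 2-vectors."""
    return tuple(
        tuple((sigma2 if i == j else 0.0) + p1 * h1[i] * h1[j] + p2 * h2[i] * h2[j]
              for j in range(2))
        for i in range(2)
    )

# Assumed values for one fading state.
sigma2, h1, h2, p2, lam1 = 1.0, (1.0, 0.0), (1.0, 1.0), 1.0, 2.0

# gain = h1^T (sigma2*I + p2*h2*h2^T)^{-1} h1: user 1's effective channel gain.
gain = quad(h1, inv2(covariance(sigma2, 0.0, h1, p2, h2)))
p1 = max(lam1 - 1.0 / gain, 0.0)  # water-filling rule

# With p1 > 0, the quadratic form with the full covariance should equal 1/lam1.
lhs = quad(h1, inv2(covariance(sigma2, p1, h1, p2, h2)))
```

Here the quadratic form with the full covariance equals both $1/\lambda_1$ and `gain / (1 + p1 * gain)`, which is the identity that the matrix inversion lemma provides.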

Therefore, we can apply the following iterative process to get the power control policy at the Nash
equilibrium point. Starting at $\mathcal{P}_1 = 0$, $\mathcal{P}_2 = 0$, each user takes a turn to water-fill over the combined
interference and the background noise. At each step, the objective function of (51) increases. But with
limited average power at the users, the objective function of (51) has an upper bound. Thus, this process
will converge, which means the Nash equilibrium exists. At the convergence point, the optimality conditions
of (51) hold, which means the power control policy of each user at the Nash equilibrium is the
same as the optimal policy that maximizes the sum-rate at the base-station. The uniqueness of the power
control policy that maximizes the sum-rate [23] implies the uniqueness of the Nash equilibrium point.
This proves our first two claims.
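
The iterative process can be sketched for a finite set of fading states. The Python code below is a minimal illustration under assumptions, not the authors' implementation: it uses real channel vectors, a discrete fading distribution, bisection for the water levels, and a closed form for the effective gain obtained from the matrix inversion lemma. All function names and numerical values are hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def effective_gain(h_own, h_other, p_other, sigma2):
    """h_own^T (sigma2*I + p_other*h_other*h_other^T)^{-1} h_own, in closed form."""
    cross = dot(h_own, h_other) ** 2
    return dot(h_own, h_own) / sigma2 - p_other * cross / (
        sigma2 ** 2 + sigma2 * p_other * dot(h_other, h_other))

def water_fill(gains, probs, budget, iters=100):
    """Powers max(lam - 1/a_k, 0) whose average over the states meets the budget."""
    floors = [1.0 / a for a in gains]
    lo, hi = 0.0, max(floors) + budget / min(probs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        used = sum(q * max(mid - f, 0.0) for q, f in zip(probs, floors))
        lo, hi = (mid, hi) if used < budget else (lo, mid)
    return [max(hi - f, 0.0) for f in floors]

def sum_rate(states, probs, p1, p2, sigma2):
    """Average of 0.5*log2 det(I + (p1*h1*h1^T + p2*h2*h2^T)/sigma2) via the chain rule."""
    total = 0.0
    for (h1, h2), q, a, b in zip(states, probs, p1, p2):
        gain2 = effective_gain(h2, h1, a, sigma2)  # user 2 sees user 1 as noise
        total += q * 0.5 * (math.log2(1.0 + a * dot(h1, h1) / sigma2)
                            + math.log2(1.0 + b * gain2))
    return total

def iterative_water_filling(states, probs, budgets, sigma2=1.0, rounds=50):
    """Users take turns water-filling, starting from zero power for both."""
    p1 = [0.0] * len(states)
    p2 = [0.0] * len(states)
    history = []
    for _ in range(rounds):
        g1 = [effective_gain(h1, h2, b, sigma2) for (h1, h2), b in zip(states, p2)]
        p1 = water_fill(g1, probs, budgets[0])
        g2 = [effective_gain(h2, h1, a, sigma2) for (h1, h2), a in zip(states, p1)]
        p2 = water_fill(g2, probs, budgets[1])
        history.append(sum_rate(states, probs, p1, p2, sigma2))
    return p1, p2, history

# Three fading states with two receive antennas (assumed values).
states = [((1.0, 0.2), (0.3, 1.0)), ((0.5, 0.5), (1.0, 0.1)), ((1.2, 0.4), (0.6, 0.9))]
probs = [0.3, 0.3, 0.4]
p1, p2, history = iterative_water_filling(states, probs, budgets=(1.0, 1.0))
```

The list `history` records the sum-rate objective after each round; it is non-decreasing because each user's best response maximizes the same objective with the other user's policy held fixed.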

From [23], we know the optimal central control policy is not time-sharing. Hence, in some channel
fading states, the transmission power of both users will be larger than zero. In these cases, the capacity
region pentagon is shown in Figure 5. We can easily see that the central rate control policy will always
operate on one of the boundary points (the line between X1 and X2), but the distributed scheme will
always choose the point X3.

Figure 5: The capacity region pentagon for fixed channel gains.

We have either

$$E_{\mathbf{H}}[\mathcal{R}_{1,N}] < E_{\mathbf{H}}[\mathcal{R}_{1,sum}] \tag{52}$$

or

$$E_{\mathbf{H}}[\mathcal{R}_{2,N}] < E_{\mathbf{H}}[\mathcal{R}_{2,sum}]. \tag{53}$$

This completes the proof. □ 

Theorem 6 contrasts with the scalar scenario, where the Nash equilibrium rate is the same as the maximum
sum-rate. The reason is that in the scalar multiple-access channel, the strategy that maximizes the sum-rate
is time sharing. In the vector case, on the other hand, we have $\min(N, N_r)$ degrees of freedom, and hence,
more than one user is allowed to transmit at any fading state. The central control policy will choose
to operate at one of the boundary points, but because of the interference, the multiple access users will
distributively choose a point that is strictly inside the capacity region at the Nash equilibrium point.
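
The gap between the distributed operating point and the boundary can be seen in a single fading state. The Python sketch below is an illustration under assumptions: real two-dimensional channel vectors, unit noise variance, fixed powers, and hypothetical function names. It compares the two successive-decoding corner points with the point at which each user treats the other as noise, which corresponds to what the proof of Theorem 6 calls X3.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gain_with_interference(h_own, h_other, p_other, sigma2):
    """h_own^T (sigma2*I + p_other*h_other*h_other^T)^{-1} h_own, in closed form."""
    cross = dot(h_own, h_other) ** 2
    return dot(h_own, h_own) / sigma2 - p_other * cross / (
        sigma2 ** 2 + sigma2 * p_other * dot(h_other, h_other))

def pentagon_points(h1, h2, p1, p2, sigma2=1.0):
    """Rate pairs (bits per real dimension) for one fading state of the vector MAC."""
    r1_alone = 0.5 * math.log2(1.0 + p1 * dot(h1, h1) / sigma2)
    r2_alone = 0.5 * math.log2(1.0 + p2 * dot(h2, h2) / sigma2)
    r1_noise = 0.5 * math.log2(1.0 + p1 * gain_with_interference(h1, h2, p2, sigma2))
    r2_noise = 0.5 * math.log2(1.0 + p2 * gain_with_interference(h2, h1, p1, sigma2))
    return {
        "corner_user1_last": (r1_alone, r2_noise),
        "corner_user2_last": (r1_noise, r2_alone),
        "both_as_noise": (r1_noise, r2_noise),
    }

# Non-orthogonal channel vectors and unit powers (assumed values).
points = pentagon_points((1.0, 0.0), (1.0, 1.0), 1.0, 1.0)
sum_capacity = 0.5 * math.log2(5.0)  # 0.5*log2 det(I + h1*h1^T + h2*h2^T) for these values
```

Both corner points attain the sum capacity, while the point where both users are treated as noise falls strictly below it whenever both users transmit and their channel vectors are not orthogonal.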

Our Stackelberg game can also be extended to the vector multiple access channel. Similar to the scalar
case, the base-station partitions the space of $(\mathbf{h}_1, \mathbf{h}_2)$ into two regions $D_1$ and $D_1^c$, and decodes user 1 first in
$D_1$ and user 2 first in the region $D_1^c$. The following results do not depend on the specific choice of
$D_1$. The strategy space of user $i$ is still $\mathcal{T}_i$, and the payoff function of each user is still the supremum of
achievable average rate.

Theorem 7 There exists a unique admissible Nash equilibrium for the low level game. The Stackelberg 
game achieves the two corner points of the capacity region but doesn’t achieve the maximum sum-rate 
point. 

Proof: The proof of the existence of a unique admissible Nash equilibrium under any base-station strategy
follows essentially the same lines as the proofs of Theorems 2 and 3. The only additional requirement is
to prove that $\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)$ is a non-decreasing function of $\lambda_1$ and a non-increasing function of $\lambda_2$.

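Based on the proof of Theorem 6, we know that the optimal power control policy of user 1 is

$$\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2) = \begin{cases}
\left(\lambda_1 - \dfrac{1}{\mathbf{h}_1^H \left(\sigma^2 \mathbf{I}_{N_r} + \mathbf{h}_2\mathbf{h}_2^H \left(\lambda_2 - \frac{\sigma^2}{\mathbf{h}_2^H\mathbf{h}_2}\right)^+\right)^{-1} \mathbf{h}_1}\right)^+, & (\mathbf{h}_1, \mathbf{h}_2) \in D_1, \\
\left(\lambda_1 - \dfrac{\sigma^2}{\mathbf{h}_1^H\mathbf{h}_1}\right)^+, & (\mathbf{h}_1, \mathbf{h}_2) \in D_1^c.
\end{cases} \tag{54}$$

It is easy to verify that $\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)$ is a non-decreasing function of $\lambda_1$. To show that $\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)$ is a
non-increasing function of $\lambda_2$, we only need to show that
$\mathbf{h}_1^H \left(\sigma^2 \mathbf{I}_{N_r} + \mathbf{h}_2\mathbf{h}_2^H \left(\lambda_2 - \frac{\sigma^2}{\mathbf{h}_2^H\mathbf{h}_2}\right)^+\right)^{-1} \mathbf{h}_1$ is a
non-increasing function of $\lambda_2$.

Using the matrix inversion lemma (46), we have

$$\mathbf{h}_1^H \left(\sigma^2 \mathbf{I}_{N_r} + \mathbf{h}_2\mathbf{h}_2^H \left(\lambda_2 - \frac{\sigma^2}{\mathbf{h}_2^H\mathbf{h}_2}\right)^+\right)^{-1} \mathbf{h}_1 = \frac{\mathbf{h}_1^H\mathbf{h}_1}{\sigma^2} - \frac{|\mathbf{h}_1^H\mathbf{h}_2|^2}{\sigma^2}\, g(\lambda_2), \tag{55}$$

in which

$$g(\lambda_2) = \frac{\left(\lambda_2 - \frac{\sigma^2}{\mathbf{h}_2^H\mathbf{h}_2}\right)^+}{\sigma^2 + \mathbf{h}_2^H\mathbf{h}_2 \left(\lambda_2 - \frac{\sigma^2}{\mathbf{h}_2^H\mathbf{h}_2}\right)^+}. \tag{56}$$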

It is easy to verify that $g(\lambda_2)$ is a non-decreasing function of $\lambda_2$, thus we come to the conclusion that
$\mathcal{P}_1(\mathbf{h}_1, \mathbf{h}_2)$ is a non-decreasing function of $\lambda_1$ and a non-increasing function of $\lambda_2$. To achieve the corner
points, the base-station can just set $D_1$ to be the whole set, in one case, and the empty set in the other case.

We prove the nonexistence of a base-station strategy that achieves the sum-rate point by contradiction.
Suppose that a partition $D_1$ achieves the sum-rate point. Since the unique power control policy that
achieves the maximum sum-rate point is to water-fill over the sum of the interference and the background
noise for both users, user 1 should stop sending in the region $D_1$, because in this region the optimal
distributed power control policy of user 2 is to water-fill only over the background noise. Similarly, in the
region $D_1^c$, user 2 should also stop sending. Then we come to a time-sharing solution, which cannot
achieve the maximum sum-rate point, and we have our contradiction. □

Finally, if the users have the opportunity to interact many times then any boundary point of the capacity
region of the vector multiple access channel can be achieved as a subgame perfect equilibrium. Moreover,
the users can use the same strategies developed in Theorem 5 to achieve these boundary points.

5 Conclusions 

This paper has developed a game theoretic framework for distributed resource allocation in fading multiple
access channels. In our first result, we showed that the opportunistic communications principle can
be obtained as the unique Nash equilibrium of a water-filling game. By introducing the base-station as a
player, we were able to achieve all the corner points of the capacity region, in addition to the sum-rate optimal
point, distributively. In slowly varying environments, where the multiple access users can be assumed
to interact many times, the repeated game formulation was shown to achieve all the boundary points of the
capacity region. Finally, we elucidated the limitations of our game theoretic framework in vector multiple
access channels.

An interesting avenue for future work is to further investigate the practical aspects of our framework.
For example, a natural extension is to consider the case with partial and/or distorted channel state information
by borrowing tools from game theory with imperfect information.